Querying Web-Sources within a Data Federation
Abstract
The Web is undoubtedly the largest and most diverse repository of data, but it was not designed to offer the capabilities of traditional database management systems, which is unfortunate. In a true data federation, all types of data sources, such as relational databases and semi-structured Web sites, could be used together. IBM WebSphere uses the “request-reply-compensate” protocol to communicate with wrappers in a data federation. This protocol expects wrappers to reply to query requests by indicating the portion of each query they can answer. While this provides a very generic approach to data federation, it also requires the wrapper developer to handle some of the complexity of capability considerations through custom coding. Alternative approaches based on declarative capability restrictions have been proposed in the literature, but they have not found their way into commercial systems, perhaps because of their complexity. We offer a practical middle-ground solution to querying Web sources, using IBM’s data federation system as an example. In place of a two-layered architecture consisting of wrapper and source layers, we propose moving the capability declaration out of the wrapper layer and into a single component that sits between the wrapper and the native data source. The advantage of this three-layered architecture is that each new Web source needs to register its capabilities with the capability-declaration component only once, which saves the work of writing a new wrapper each time. The inclusion of Web sources can thus be accelerated without requiring changes to existing data federation technology.
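The idea in the abstract can be illustrated with a minimal sketch. It is not IBM's wrapper API: `CapabilityRegistry`, `Predicate`, `compensate`, the `bookstore` source, and its rows are all hypothetical names and data chosen for illustration. The sketch assumes that "reply" means returning the subset of predicates a source can evaluate, and that "compensate" means the federation engine applies the remaining predicates itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Predicate:
    """A simple selection predicate such as price < 100."""
    column: str
    op: str
    value: object


@dataclass
class CapabilityRegistry:
    """Hypothetical capability-declaration component shared by all Web sources."""
    # source name -> set of (column, operator) pairs the source can evaluate
    declared: dict = field(default_factory=dict)

    def register(self, source, capabilities):
        # Each Web source declares its capabilities once; no new wrapper is written.
        self.declared[source] = set(capabilities)

    def reply(self, source, predicates):
        """Split a request into the accepted part and the part left to the engine."""
        caps = self.declared.get(source, set())
        accepted = [p for p in predicates if (p.column, p.op) in caps]
        rejected = [p for p in predicates if (p.column, p.op) not in caps]
        return accepted, rejected


# Operators the (assumed) federation engine can evaluate on its own.
OPS = {"=": lambda a, b: a == b, "<": lambda a, b: a < b, ">": lambda a, b: a > b}


def compensate(rows, predicates):
    """Apply, inside the engine, the predicates the source could not handle."""
    return [r for r in rows if all(OPS[p.op](r[p.column], p.value) for p in predicates)]


registry = CapabilityRegistry()
registry.register("bookstore", {("author", "=")})  # the source can only filter by author

request = [Predicate("author", "=", "Knuth"), Predicate("price", "<", 100)]
accepted, rejected = registry.reply("bookstore", request)

# Made-up rows the source would return after evaluating only the accepted predicate.
source_rows = [
    {"author": "Knuth", "title": "A", "price": 80},
    {"author": "Knuth", "title": "B", "price": 150},
]
result = compensate(source_rows, rejected)
```

In this sketch, adding another Web source takes one more `register` call rather than a new wrapper, which is the saving the three-layered architecture aims for.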
Similar resources
General Strategy for Querying Web Sources in a Data Federation Environment
Modern database management systems support the inclusion and querying of nonrelational sources within a data federation environment via wrappers. Wrapper development for Web sources, however, entangles code with extraction and query-planning knowledge and becomes a daunting task. We use the IBM DB2 federation engine to demonstrate the challenges of incorporating Web sources into a ...
ELITE: An Entailment-Based Federated Query Engine for Complete and Transparent Semantic Data Integration
In recent years the core of the semantic web has evolved into a conceptual layer built by a set of ontologies mapped onto data distributed in numerous data sources, interlinked, interpreted and processed in terms of semantics. One of the central issues in this context became the federated querying of such linked data. This paper presents the federated query engine ELITE that facilitates a compl...
HiBISCuS: Hypergraph-Based Source Selection for SPARQL Endpoint Federation
Efficient federated query processing is of significant importance to tame the large amount of data available on the Web of Data. Previous works have focused on generating optimized query execution plans for fast result retrieval. However, devising source selection approaches beyond triple pattern-wise source selection has not received much attention. This work presents HiBISCuS, a novel hypergr...
Hypermedia-Based Discovery for Source Selection Using Low-Cost Linked Data Interfaces
Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources, and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often marginally discussed—even though it has a strong impact on selecting sources that contribute to...
GUN: An Efficient Execution Strategy for Querying the Web of Data
Local-As-View (LAV) mediators provide a uniform interface to a federation of heterogeneous data sources and attempt to execute queries against the federation. LAV mediators rely on query rewriters to translate mediator queries into equivalent queries on the federated data sources. The query rewriting problem in LAV mediators has been shown to be NP-complete, and there may be an exponential numb...